ABSTRACT



Self adaptive filters adjust their parameters to perform
an almost optimal filtering operation without a priori
knowledge of the input signal statistics. Two approaches to the
design of efficient self adaptive discrete filtering algorithms
are considered.

For non-recursive (FIR) adaptive filters, simplified
estimations of the gradient of the performance function to be
minimized are considered. These algorithms result in reduced
complexity of implementation and improved dynamic operating
range, with about the same misadjustment errors and convergence
time as the classic LMS (Least Mean Squared) algorithm. An
analysis of the simplified gradient approach is presented and
confirmed experimentally for the specific example of an adaptive
line enhancer (ALE). The results are used to compare the
simplified gradient approaches with each other and with the LMS
algorithm. This comparison is done using a new graphic
presentation of adaptive filter operating characteristics and a
complexity index. It indicates that the simplified
gradient estimators are superior to the LMS algorithm for
filters of equal complexity.

For recursive (IIR) adaptive filters a combined random
and gradient search (RGS) algorithm is proposed, analyzed and
tested. Since for the IIR filter the performance surface is
multimodal in the feedback parameters and unimodal in the
feedforward parameters, random search is used to adjust the
feedback parameters and gradient search to adjust the
feedforward parameters. Convergence to the globally optimal
filter parameters is guaranteed for sufficiently long
adaptation time. A convergence time estimate for the RGS
algorithm is derived and supported by simulation results for
the ALE example. Finally, a priori knowledge of the optimal
filter structure is taken into account in the formulation of
an improved version of the basic RGS algorithm. This
improvement is confirmed with the ALE example.

I. INTRODUCTION



1.1. BACKGROUND 

In a broad sense the term filter implies an operation 
on an input signal or collection of data in order to smooth, 
predict, or estimate a desired property hidden in the input. 
Fig. 1.1-1 presents the block diagram of a discrete time 
linear recursive digital filter. An optimal filter is one 
designed to be optimum or best with respect to a performance 
criterion that measures or expresses its effectiveness. The 
most commonly used approach to optimal filter design is the 
linear filter optimized with respect to a Minimum Mean 
Squared Error (MMSE), where the error is defined as the 
difference between the filter output and a desired signal. 

This optimal filter is usually called the Wiener filter.
Filter realization may be for: (a) analog signals and
continuous time, (b) analog signals and discrete time, (c)
digital signals and discrete time. This dissertation is
applicable to cases (b) and (c). A basic discussion of
discrete Wiener filters is presented by Nahi [28, Ch 5].
As expected, the parameters of the optimal filter depend upon
properties of the input and desired signals. For example,
the Wiener filter solution depends upon the second order
statistics of the input signal and the desired signal.

Fig. 1.1-1
Discrete Linear Filter



The performance surface describes the filter performance
criterion as a function of its weights (parameters, or
coefficients $a_i$, $b_i$ of Fig. 1.1-1). Each point of the surface
is the value of the performance criterion for specific
weights of the filter. The term performance function will
be used to describe the performance criterion values as a
function of time during the adaptation process. The optimal
filter weights are those at the global minimum point of the
performance surface.
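As a small illustration of a performance surface, the sketch below evaluates the mean squared error of a hypothetical one-weight filter y(k) = a x(k) over a grid of weight values and locates its minimum. The closed form used follows from expanding E{(a x - s)^2}. The function names and the numerical statistics are assumptions chosen for illustration and are not taken from this work.

```python
def mse_surface(a, r_xx, r_xs, r_ss):
    """MSE of the one-weight filter y(k) = a*x(k) against the desired s(k).

    Expanding E{(a*x - s)^2} gives a^2*R_xx(0) - 2*a*R_xs(0) + R_ss(0).
    """
    return a * a * r_xx - 2.0 * a * r_xs + r_ss


def grid_minimum(r_xx, r_xs, r_ss, lo=-1.0, hi=2.0, steps=300):
    """Return (weight, MSE) at the lowest point of the surface on a grid."""
    grid = [lo + (hi - lo) * n / steps for n in range(steps + 1)]
    best = min(grid, key=lambda a: mse_surface(a, r_xx, r_xs, r_ss))
    return best, mse_surface(best, r_xx, r_xs, r_ss)


# Illustrative (assumed) second order statistics of input and desired signal.
R_XX, R_XS, R_SS = 2.0, 1.0, 1.0
a_best, j_best = grid_minimum(R_XX, R_XS, R_SS)
```

For these assumed statistics the grid minimum coincides with the ratio R_xs(0)/R_xx(0), which is the one-weight form of the Wiener solution, and the surface is a single parabola with one minimum.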

In those cases where the information (input statistics) 
needed to design an optimal filter is not available, or in 
those cases where the filter is required to operate under 
statistically nonstationary input signal conditions, the 
usual optimal design approach is not applicable. In some of
these cases, a self adaptive filter can be used to overcome
this lack of information. The adaptive filter tries to
adjust its parameters dynamically to variations in the
statistics of the input signal. For the weight adjustment, or
adaptation, the adaptive filter uses an error signal.
Ideally this error is the difference between the filter
output and a desired signal. In many applications the desired
signal is not available per se, so that a reference signal,
related to the desired signal in some way, is used to develop
the error signal. Fig. 1.1-2 presents a block diagram of an
adaptive filter with its input, output and reference signals.
The adaptive filter thus includes a signal processing section
which is similar to a non-adaptive filter, except that the
filter weights are adjustable and controlled by the second
portion of the adaptive filter, namely the weight adaptation
algorithm. The weight optimization algorithm typically
estimates the gradient of the performance surface and
adjusts the weights in the direction of steepest descent.
For a statistically stationary situation, after some
transient, the adaptive filter can be expected to reach a
steady-state condition at which the parameters jitter around the
minimum point of the performance surface.

Fig. 1.1-2
Adaptive Filter

The generation of the reference signal is a key
consideration in adaptive filter implementation. There are
various practical methods as discussed in [1, 2, 3, 7, 22,
24, 26, 29, 32, 37, 38, 39]. In many of these applications
the reference signal is not identical to the signal we would
like to have as output of the filter because, if we had the
desired output, we wouldn't need the filter. In spite of
the approximations involved, the adaptive filter is still
able to operate and optimize the weights in many practical
applications.

This dissertation investigates two approaches to
efficient adaptive filters. Chapter II discusses simplified
gradient estimation methods for non-recursive filters and
Chapter III discusses recursive filters based on a combined
random and gradient search adaptation technique.

1.2 FIR* ADAPTIVE FILTERS

The FIR filter is the simplest form of digital filter.
The processing operation produces an output which is the
linear sum of weighted delayed input samples. The impulse
response of this filter is given by the sequence of values
of the filter weights. Because of its relative simplicity,
the FIR adaptive filter historically was the starting point
for the development of adaptive filters.

* FIR (Finite Impulse Response) and IIR (Infinite Impulse
Response) are generally used by the signal processing community
to denote non-recursive and recursive filters respectively
and are so used in this work. It is noted though, that some
recursive filters can have a finite impulse response.

A very important property of this filter is that its
performance surface is quadratic, so it has one and only one
minimum, i.e. it is a unimodal surface as shown by Widrow [1].
For a unimodal surface, a gradient minimum seeking algorithm
will converge to the minimum (a formal proof is presented in
[1]), and this property is the key to the success of the
Least Mean Squared (LMS) algorithm, discussed later. Interest
in the area of adaptive filtering started in the late
50's and early 60's. The most successful approach is Widrow's
LMS algorithm. Widrow in [1] presents the classic LMS
algorithm and summarizes most of the previous work on the subject.
The LMS algorithm and its basic properties are presented
later. In [3] Widrow et al introduce the concept of noise
cancelling, which uses a reference signal that is related only
to the noise to estimate the noise portion of the input. The
output is produced by subtracting the noise estimate from
the input signal. In [4] Widrow et al extended the
analysis to non-stationary operation of the LMS algorithm. In
this study they identify two sources of misadjustment (a
measure of the distance of the actual steady-state error
from the optimal steady-state error) with a nonstationary
input signal. The first is due to gradient estimation
errors (or gradient noise), which also exist with stationary
inputs. The second cause of misadjustment with a
nonstationary input is the changing statistics, which
result in a lag in updating the filter weights after the
optimal solution. This analysis gives some insight into the
problem and provides basic design information. In [5]
Widrow and McCool present a random search FIR filter and
compare it to the LMS algorithm. Using the unimodal
property of the FIR filter they modify the random search
algorithm so that high performance function value points
(which in regular random search methods are discarded)
contribute to convergence towards the optimum. Their
conclusion is that the LMS is a better algorithm; it converges
faster and produces less steady-state misadjustment. In
[6] Widrow et al present versions of the LMS algorithm
that operate on complex data. This concept has recently
become important because of the use of adaptive techniques
in the frequency domain, Dentino [16] and Zentner [17].
Lucky [7] introduces a Minimum Magnitude performance
criterion to derive an adaptive equalizer. Digital
communication systems use equalizers to reduce the
intersymbol interference in a communication channel. Lucky's
solution involves transmission of a special training
sequence which is known at the receiver and is used there as
the reference signal. Another interesting point in his
solution is the use of quantized variables in the adaptation
algorithm.

Finally Frost [26], Owsley [29], Widrow et al [38], 
Griffiths and Jim [41] and many others discuss the use of 
the LMS algorithm for adaptive control of sensor beamforming 
arrays. We will not discuss these applications in this 
dissertation because of their specialized nature. However, 
it is noted that the simplified algorithms presented here are 
general and may be used to advantage in antenna arrays . 

From the references the importance of the LMS algorithm
is very clear. Surprisingly enough, very little was done to
improve the basic algorithm, the emphasis being primarily on
applications of the concept. Gersho [40] discusses adaptation
in a quantized parameter space. Gersho's discussion is of a
general nature, i.e. no specific performance criterion was
assumed, and his main result is that for unimodal performance
surfaces and deterministic gradient (i.e. no need for
stochastic gradient estimation), the quantized algorithm will
converge to the neighborhood of the optimal solution.

Moschner [27] is the only published attempt to derive
computationally more efficient versions of the basic LMS
weight adaptation algorithm, and these results have not
been used in practice. Griffiths and Jim in a recent paper
[41] discuss a simplified adaptive system from another point
of view. Their concern is to simplify the signal processing
section in order to achieve high frequency operation. They
propose a 3 level weight quantization, with no
multiplications in the signal processing portion. The resulting
weight adaptation scheme is based on the LMS algorithm, and
it is necessary to store past quantizations. Hence it is
more complicated, but the goal of high frequency operation
is achieved.

Summary of LMS Algorithm 

Because of its importance, the LMS adaptive algorithm is 
presented here following the basic references [1, 2, 3, 4, 5]. 
The basic filter output is given by:

$$ y(k) = \sum_{i=0}^{N_a-1} a_i(k)\, x(k-i) \tag{1.2-1} $$

where: $k$ is the time index
       $N_a$ is the number of filter weights
       $a_i(k)$ is the ith weight at time $k$
       $x(k) = s(k) + n(k)$ is the input signal consisting of
       desired signal $s(k)$ and additive noise $n(k)$.

We want to minimize the performance function:

$$ J(k) = E\{\varepsilon_s^2(k)\} = E[\{y(k) - s(k)\}^2] \tag{1.2-2} $$

where: $\varepsilon_s(k) = y(k) - s(k)$ is the error.

In order to perform the adaptation algorithm we need the
gradient of the performance surface:

$$ \nabla_{a_i}(k) = \frac{\partial J(k)}{\partial a_i}, \qquad i = 0, 1, \ldots, N_a - 1 \tag{1.2-3} $$

In practice we don't have J(k) since s(k) is not known nor 
do we have an ensemble of processes to perform the expecta- 
tion operation of (1.2-2). Thus we must use an estimate 
of the performance function: 



$$ \hat{J}(k) = \varepsilon_r^2(k) = \{y(k) - r(k)\}^2 \tag{1.2-4} $$

where $r(k)$ is a reference signal, not necessarily identical
to $s(k)$.

The gradient estimate is given by:

$$ \hat{\nabla}_{a_i}(k) = \frac{\partial \hat{J}(k)}{\partial a_i} = \frac{\partial \varepsilon_r^2(k)}{\partial a_i} = 2\varepsilon_r(k) \frac{\partial \varepsilon_r(k)}{\partial a_i} = 2\varepsilon_r(k) \frac{\partial y(k)}{\partial a_i} = 2\varepsilon_r(k)\, x(k-i) \tag{1.2-5} $$

Using the gradient estimate of (1.2-5), the LMS weights
adaptation is given by:

$$ a_i(k+1) = a_i(k) - \mu_a \hat{\nabla}_{a_i}(k), \qquad i = 0, 1, \ldots, N_a - 1 \tag{1.2-6} $$

where $\mu_a$ is the adaptation gain controlling the convergence
and steady-state properties of the filter.
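The LMS iteration of (1.2-1), (1.2-5) and (1.2-6) can be sketched in a few lines. The following is a minimal illustration, not the simulation of Appendix A: the names `lms_step` and `run_lms`, the uniform random input, the noise-free reference generated by a known two-weight FIR system, and the gain value are all assumptions made for the example.

```python
import random


def lms_step(a, x_hist, r, mu_a):
    """One LMS iteration following (1.2-1), (1.2-5) and (1.2-6).

    a      : current weights a_i(k), i = 0..N_a-1
    x_hist : delayed inputs, x_hist[i] = x(k-i)
    r      : reference sample r(k)
    mu_a   : adaptation gain
    Returns the updated weights, the output y(k) and the error e_r(k).
    """
    y = sum(a_i * x_i for a_i, x_i in zip(a, x_hist))       # (1.2-1)
    err = y - r                                             # e_r(k) = y(k) - r(k)
    grad = [2.0 * err * x_i for x_i in x_hist]              # (1.2-5)
    a_next = [a_i - mu_a * g for a_i, g in zip(a, grad)]    # (1.2-6)
    return a_next, y, err


def run_lms(target, mu_a, n_steps, seed=0):
    """Adapt towards a known (hypothetical) FIR system used as reference."""
    rng = random.Random(seed)
    n_a = len(target)
    a = [0.0] * n_a
    x_hist = [0.0] * n_a
    for _ in range(n_steps):
        # Shift the delay line and draw a new input sample x(k).
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]
        # Noise-free reference: output of the system to be identified.
        r = sum(t * x for t, x in zip(target, x_hist))
        a, _, _ = lms_step(a, x_hist, r, mu_a)
    return a


weights = run_lms(target=[0.5, -0.3], mu_a=0.05, n_steps=2000)
```

With a noise-free reference there is no gradient noise at the optimum, so in this idealized setting the weights settle on the target values rather than jittering around them.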

Reference [4] assumes a stationary input with
uncorrelated samples and derives formulas for the stability region,
convergence time, and misadjustment as follows.

Stable convergence of the adaptation algorithm is limited to
values of $\mu_a$ given by:

                                                        (1.2-7)

where $R_{xx}(m) = E\{x(k)\,x(k-m)\}$ is the autocorrelation function
of the input. Equation (1.2-7) was derived using the mean
of the gradient estimate. So, in practice, in order to be
stable at all times we need

$$ \mu_a \ll 1/[N_a R_{xx}(0)]. $$

The approximate Mean Squared Error (MSE) convergence time
constant is given by

$$ \tau_{MSE} = \frac{1}{4 \mu_a R_{xx}(0)} \tag{1.2-8} $$

The misadjustment, M, is defined as the ratio of the excess
Mean Squared Error (MSE), due to adaptive filter steady-state
jitter around the optimal solution, to the minimum MSE:

$$ M = \frac{J_{\text{steady-state}} - J_{min}}{J_{min}} = \mu_a N_a R_{xx}(0) \tag{1.2-9} $$

where

$$ J_{ss} = J_{\text{steady-state}} = \lim_{k \to \infty} J(k) $$

$$ J_{min} = J_{ss}[\text{filter optimal}] = \text{Minimum MSE} $$

The misadjustment estimate (1.2-9) was derived for an ideal
reference signal, $r(k) = s(k)$, and does not apply to cases
of noisy reference.
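The design trade-off implied by (1.2-8) and (1.2-9) can be made concrete with a short numerical sketch. The function names and the example values (15 weights, unit input power, gain 0.0005) are assumptions for illustration only.

```python
def mse_time_constant(mu_a, r_xx0):
    """Approximate MSE convergence time constant, (1.2-8)."""
    return 1.0 / (4.0 * mu_a * r_xx0)


def misadjustment(mu_a, n_a, r_xx0):
    """Misadjustment estimate for an ideal reference, (1.2-9)."""
    return mu_a * n_a * r_xx0


# Assumed example values: 15 weights, unit input power R_xx(0) = 1.
tau = mse_time_constant(mu_a=0.0005, r_xx0=1.0)    # about 500 iterations
m = misadjustment(mu_a=0.0005, n_a=15, r_xx0=1.0)  # about 0.0075
```

Multiplying the two formulas gives a product of N_a/4 that does not depend on the gain, so under these approximations halving the misadjustment doubles the convergence time constant.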

The derivations in [1, 2, 3, 4, 5] are based upon the
use of eigenvalue and eigenvector analysis. To obtain practical
estimation formulas the eigenvalue based equations are
approximated by correlation functions. The analysis presented
in this dissertation makes the approximations at the start
of the derivations and uses correlation functions throughout.
The advantage of this approach is that it provides better
insight into the nature of the approximations.

1.3 THE IIR* ADAPTIVE FILTER

An IIR filter uses previous output values to compute 
the present filter output: 



$$ y(k) = \sum_{i=0}^{N_a-1} a_i\, x(k-i) + \sum_{i=1}^{N_b} b_i\, y(k-i) \tag{1.3-1} $$



Because of the feedback in (1.3-1) the impulse response may 
be infinite and is designated IIR. 
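A direct evaluation of (1.3-1) with fixed weights shows how the feedback terms produce a response that does not end after a finite number of samples. This is a minimal sketch; the function name `iir_filter` and the zero initial conditions are assumptions of the example.

```python
def iir_filter(x, a, b):
    """Direct evaluation of (1.3-1) with fixed weights.

    x : input samples x(0), x(1), ...
    a : feedforward weights a_0 .. a_{N_a-1}
    b : feedback weights b_1 .. b_{N_b} (b[0] holds b_1)
    Samples before k = 0 are taken as zero (an assumption of this sketch).
    """
    y = []
    for k in range(len(x)):
        acc = 0.0
        for i, a_i in enumerate(a):
            if k - i >= 0:
                acc += a_i * x[k - i]      # feedforward terms a_i x(k-i)
        for i, b_i in enumerate(b, start=1):
            if k - i >= 0:
                acc += b_i * y[k - i]      # feedback terms b_i y(k-i)
        y.append(acc)
    return y


impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = iir_filter(impulse, a=[1.0], b=[0.5])
```

With one feedforward weight and one feedback weight of 0.5, each output sample is half the previous one, so the impulse response decays geometrically but never becomes exactly zero.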

Because of inherent savings due to the use of previously
calculated values (the existence of poles in the transfer
function), the IIR filter is the most efficient filtering
scheme for many applications.

Since it uses feedback, the IIR filter can be unstable.
This presents a design problem for the conventional IIR
filter, and a basic requirement for an IIR adaptation algorithm
is to assure that the resulting filter is stable. A second
disadvantage of the IIR adaptive filter is the multimodal
nature of its performance surface as discussed in section 3.1.

White [8] was the first to suggest the use of IIR
structures for an adaptive filter. He indicates a possible use of
several performance criteria and derives the gradient
expression for the Minimum Mean Squared Error (MMSE) performance
criterion. In [9], Stearns et al present an all adaptive IIR
filter. Stearns' algorithm is rather complex, i.e. the number
of operations (multiplications and additions) is proportional
to $\alpha_1 N_a N_b + \alpha_2 N_b$, compared to the relative simplicity of
the LMS where the number of operations is proportional to $N_a$.
Stearns' algorithm is discussed later and its gradient
estimation method is presented in detail.

In [10] Feintuch presents a much simpler adaptive IIR
filter which consists of two LMS adaptive sections: one controls
the feedforward weights adaptation and the second controls the
feedback weights adaptation. The gradient estimation method of
Feintuch's algorithm is presented later on in this section.
Feintuch's algorithm works in some cases but, as pointed out
by several investigators [11, 12], the derivation has errors
and the filter, at least in the examples presented in [11],
does not converge to the optimal solution.

In [13] Parikh and Ahmed used the same examples presented
in [11] to demonstrate the convergence properties of Stearns'
algorithm. Reference [13] shows that Stearns' algorithm does
converge to a minimum point, but with a multimodal performance
surface the steady-state might be around a local minimum or the
global minimum depending upon the starting point of the
adaptation process. McMurray [14] investigates the dependence of
the stability of Feintuch's algorithm on the values of its adaptation
gains. The region of stable operation turns out to be a
triangle in the adaptation gains space. In [15] McMurray
investigates the convergence time for IIR filtering of narrow
band signals with Feintuch's algorithm and compares operation in the
time and frequency domains. In both cases the convergence
time is inversely proportional to the square root of the
product of four factors: feedforward adaptation gain,
feedback adaptation gain, number of feedforward weights, and
the number of feedback weights. An additional conclusion was
that the convergence time is shorter for the time domain
operation. Parker and Ko [18] extend the adaptive IIR
filter for image processing. In [35] Treichler, Larimore
and Johnson modify Feintuch's algorithm by passing the error
term through a FIR filter. This modification allows for
convergence to a minimum (not necessarily global), and its use is
limited by the information needed for the design of the error
term filter. The existing IIR adaptive algorithms are based
upon Stearns' and Feintuch's algorithms, which are summarized
briefly in the following.

In order to have a practical adaptation method we use a 
performance function estimate: 



$$ \hat{J}(k) = \varepsilon_r^2(k) = \{y(k) - r(k)\}^2 \tag{1.3-2} $$

where $r(k)$ is the reference signal and $y(k)$ is given by
(1.3-1) with weights $a_i(k)$ and $b_i(k)$ being a function of time.
The gradient estimate is now given by:

$$ \hat{\nabla}_{a_i}(k) = \frac{\partial \hat{J}(k)}{\partial a_i} = 2\varepsilon_r(k) \frac{\partial \varepsilon_r(k)}{\partial a_i} = 2\varepsilon_r(k) \frac{\partial y(k)}{\partial a_i} \tag{1.3-3} $$

The derivative $\partial y(k)/\partial a_i$ is not as simple as in (1.2-5) because
of the feedback terms such as $b_j\, y(k-j)$ present in $y(k)$.

The final form of the gradient estimates of Stearns' algorithm
is given by:

$$ \hat{\nabla}_{a_i}(k) = 2\varepsilon_r(k)\, \alpha_i(k), \qquad i = 0, 1, \ldots, N_a - 1 \tag{1.3-4} $$

where:

$$ \alpha_i(k) = x(k-i) + \sum_{j=1}^{N_b} b_j(k)\, \alpha_i(k-j) \tag{1.3-5} $$

and:

$$ \hat{\nabla}_{b_i}(k) = 2\varepsilon_r(k)\, \beta_i(k), \qquad i = 1, 2, \ldots, N_b \tag{1.3-6} $$

where:

$$ \beta_i(k) = y(k-i) + \sum_{j=1}^{N_b} b_j(k)\, \beta_i(k-j) \tag{1.3-7} $$

Equations (1.3-4, 5, 6, 7) are the gradient estimates of
Stearns' algorithm. Feintuch's algorithm [10] uses only
the first terms in the expressions for $\alpha_i$ (1.3-5) and $\beta_i$
(1.3-7), and the resulting gradient estimates are:

$$ \hat{\nabla}_{a_i}(k) = 2\varepsilon_r(k)\, x(k-i) \tag{1.3-8} $$

$$ \hat{\nabla}_{b_i}(k) = 2\varepsilon_r(k)\, y(k-i) \tag{1.3-9} $$
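The recursions (1.3-5) and (1.3-7) require a stored history of past values of each auxiliary signal. The sketch below shows one possible bookkeeping for a single time step of the gradient estimates (1.3-4) to (1.3-7). The function name `stearns_gradients` and the list layout of the histories are assumptions of this example, not the implementation of [9].

```python
def stearns_gradients(err, x_hist, y_hist, b, alpha_hist, beta_hist):
    """Gradient estimates (1.3-4) to (1.3-7) for one time step k.

    err        : error e_r(k)
    x_hist     : x_hist[i] = x(k-i), i = 0..N_a-1
    y_hist     : y_hist[i-1] = y(k-i), i = 1..N_b
    b          : feedback weights, b[j-1] = b_j(k)
    alpha_hist : alpha_hist[i][j-1] = alpha_i(k-j), j = 1..N_b
    beta_hist  : beta_hist[i-1][j-1] = beta_i(k-j), j = 1..N_b
    Returns the two gradient vectors and the shifted histories.
    """
    n_b = len(b)
    # (1.3-5): alpha_i(k) = x(k-i) + sum_j b_j(k) alpha_i(k-j)
    alpha = [x_hist[i] + sum(b[j] * alpha_hist[i][j] for j in range(n_b))
             for i in range(len(x_hist))]
    # (1.3-7): beta_i(k) = y(k-i) + sum_j b_j(k) beta_i(k-j)
    beta = [y_hist[i] + sum(b[j] * beta_hist[i][j] for j in range(n_b))
            for i in range(n_b)]
    grad_a = [2.0 * err * v for v in alpha]     # (1.3-4)
    grad_b = [2.0 * err * v for v in beta]      # (1.3-6)
    # Push the new values to the front of each history for the next step.
    new_alpha_hist = [[alpha[i]] + alpha_hist[i][:-1] for i in range(len(alpha))]
    new_beta_hist = [[beta[i]] + beta_hist[i][:-1] for i in range(n_b)]
    return grad_a, grad_b, new_alpha_hist, new_beta_hist


grad_a, grad_b, alpha_next, beta_next = stearns_gradients(
    err=1.0, x_hist=[1.0, 2.0], y_hist=[3.0], b=[0.5],
    alpha_hist=[[2.0], [4.0]], beta_hist=[[2.0]])
```

Setting all feedback weights to zero removes the recursive sums, and the estimates then reduce to the forms (1.3-8) and (1.3-9) used by Feintuch's algorithm.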



With both algorithms the weights adaptation is given by:

$$ a_i(k+1) = a_i(k) - \mu_a \hat{\nabla}_{a_i}(k), \qquad i = 0, 1, \ldots, N_a - 1 \tag{1.3-10} $$

$$ b_i(k+1) = b_i(k) - \mu_b \hat{\nabla}_{b_i}(k), \qquad i = 1, 2, \ldots, N_b \tag{1.3-11} $$

where $\mu_a$ and $\mu_b$ are the feedforward and feedback adaptation
gains. These adaptive IIR filtering schemes are not
satisfactory solutions to the IIR filtering problem. Stearns'
algorithm is not satisfactory for the following
reasons:

- The instability problem mentioned by Elliott, Jacklin
  and Stearns [25]; this problem is discussed later.

- The algorithm does not assure convergence to the global
  minimum.

- It is a complicated algorithm.

The Feintuch algorithm is not satisfactory mainly because in
some cases it fails to converge to a minimum point, and in
all cases it does not assure convergence to the global minimum
of the performance surface.
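For comparison with the LMS iteration, one step of Feintuch's scheme, using the gradient estimates (1.3-8) and (1.3-9) in the updates (1.3-10) and (1.3-11), can be sketched as follows. The function name `feintuch_step` and the argument layout are assumptions of this illustration.

```python
def feintuch_step(a, b, x_hist, y_hist, r, mu_a, mu_b):
    """One iteration of the simplified IIR adaptation.

    a, x_hist : feedforward weights and x_hist[i] = x(k-i), i = 0..N_a-1
    b, y_hist : feedback weights and y_hist[i-1] = y(k-i), i = 1..N_b
    r         : reference sample r(k)
    mu_a,mu_b : feedforward and feedback adaptation gains
    """
    # Filter output (1.3-1) with the current weights.
    y = (sum(a_i * x_i for a_i, x_i in zip(a, x_hist))
         + sum(b_i * y_i for b_i, y_i in zip(b, y_hist)))
    err = y - r
    # (1.3-8) with (1.3-10): feedforward update uses 2 e_r(k) x(k-i).
    a_next = [a_i - mu_a * 2.0 * err * x_i for a_i, x_i in zip(a, x_hist)]
    # (1.3-9) with (1.3-11): feedback update uses 2 e_r(k) y(k-i).
    b_next = [b_i - mu_b * 2.0 * err * y_i for b_i, y_i in zip(b, y_hist)]
    return a_next, b_next, y, err


a_new, b_new, y_k, e_k = feintuch_step(
    a=[1.0, 0.0], b=[0.0], x_hist=[1.0, 2.0], y_hist=[0.5],
    r=0.0, mu_a=0.1, mu_b=0.1)
```

The step is two LMS-style updates side by side, which is what makes the scheme simple. Nothing in it checks that the updated feedback weights keep the filter stable.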
1.4 INTRODUCTION TO ADAPTIVE FIR FILTERS USING SIMPLIFIED
GRADIENT ESTIMATIONS

The LMS algorithm is being used in many adaptive
filtering applications [1-6, 16, 17, 22, 24, 26, 29, 32, 34, 37,
38, 39, 41], with satisfactory results. The possibility of
using simplified algorithms, with hardware and time savings,
has not received much attention. Gersho [40], Moschner
[27], and recently Griffiths and Jim [41] (which discusses a
somewhat different problem of simplifying the signal processing
portion with a more complicated adaptation algorithm) appear
to be the only publications in this area. All applications,
except [41], seem to select the classical LMS algorithm and
not a simplified version. A possible reason for this fact
might be the lack of confidence in the performance of a
simplified algorithm, compared with the many satisfactory
results obtained with the use of the LMS algorithm. This
dissertation will demonstrate analytically, and by extensive
simulation, the advantages and savings associated with the
use of the simplified algorithms. One natural simplified
algorithm investigated here is the use of a positive or
negative Fixed Step Correction (FSC) in the adaptation, instead
of the LMS correction which is proportional to the value of
the gradient. This gradient estimate is given by:

$$ \hat{\nabla}_{FSC}(i,k) = \mathrm{Sgn}\{\hat{\nabla}_{LMS}(i,k)\} = \mathrm{Sgn}\{\varepsilon_r(k)\}\, \mathrm{Sgn}\{x(k-i)\} \tag{1.4-1} $$

where:

$$ \mathrm{Sgn}[\cdot] = \begin{cases} 1 & \text{if } [\cdot] > 0 \\ -1 & \text{if } [\cdot] < 0 \end{cases} $$

The second algorithm investigated here is to use a modified
FSC with the step size proportional to the magnitude of the
error. This algorithm is called here the Simplified LMS
(SLMS); Moschner [27] called this the clipped LMS. The SLMS
has the following gradient estimate:

$$ \hat{\nabla}_{SLMS}(i,k) = \varepsilon_r(k)\, \mathrm{Sgn}\{x(k-i)\} \tag{1.4-2} $$

Chapter II discusses these algorithms and presents an analysis
and simulation of adaptive FIR filter operation using these
algorithms.

The optimal Wiener filter depends upon the statistics of
the input signal and the desired signal, and the steady-state
behavior of an adaptive filter depends upon the corresponding
statistics. Since the desired signal is not available to the
adaptive filter, which uses a reference signal that is only
related to the desired signal, the properties of this filter
differ depending upon the application
and the manner in which the reference signal is provided. In
Chapter II an adaptive FIR filter is used as an adaptive line
enhancer (ALE) [3, 34, 37, 39], which is a typical signal
processing application and utilizes a noisy reference.
Appendix A describes the simulation details.

The discussion in Chapter II includes for each algorithm 
the following topics: 

- convergence and stability, section 2.2.

- convergence time (TC), section 2.3.

- steady-state misadjustment (M), section 2.4.

- implementation complexity, section 2.1.

- dynamic range, section 2.6.

Sections 2.3 and 2.4 include derivations of estimation
formulas for the convergence time, TC, and misadjustment, M,
of the FSC and SLMS algorithms.

The simulation experiment, described in Appendix A, shows
good agreement with these misadjustment and convergence time
formulas.

Fig. 1.4-1 presents a typical operation of the adaptive
FIR filter with the LMS, FSC, and SLMS algorithms. This
figure shows a typical weight, $a_i$, and the Mean Squared
Error (MSE), as a function of time for the three algorithms
as noted. On each plot we have drawn the optimal value of
the weight or the MSE, an ensemble average of 100 runs, as well
as the convergence of an individual filter (single run). In
Fig. 1.4-1 all of the algorithms perform, on the average,
about the same.

For more accurate comparison, a graphic presentation of
adaptive filter properties is introduced in Section 2.1. This
graphic presentation, the Adaptive Filter Operating
Characteristic (AFOC), is used to compare equal degree and equal
complexity filters with different algorithms.

Fig. 1.4-1
Typical FIR adaptive filters operation with [...] = 15,
[...] = .0005, [...] = .001, ALE experiment

If one looks ahead to Fig. 2.1-1 it is apparent that the
simplified gradient algorithms (FSC and SLMS), when compared
to the LMS algorithm with equal complexity (cost) and equal
convergence time, are more effective and provide more
processing gain (processing gain is defined later as a measure
of filter effectiveness).

1.5 INTRODUCTION TO ADAPTIVE IIR FILTERS USING RANDOM
SEARCH TECHNIQUES

Adaptive IIR filters based on gradient methods have one
major disadvantage, which is the multimodal structure of the
performance surface as discussed in section 3.1. Thus there
is no inherent way to assure a steepest descent gradient
convergence to the global minimum. The convergence problem
and additional disadvantages of Stearns' and Feintuch's
algorithms, as discussed in section 1.3, suggest that
gradient methods may not be the best adaptation scheme for the
IIR filter. Thus a different adaptation technique, namely
random search, is considered here. The basic concept of
random search is discussed in section 3.2.

A random search IIR filter is presented and discussed
in section 3.3. It is concluded there that this scheme is
not satisfactory. The fact that the IIR filter's performance
surface is quadratic in the feedforward weights (Elliott et
al [25]) is the key for the hybrid Random and Gradient
Search (RGS) algorithm developed in section 3.4. This new
algorithm provides for satisfactory operation of an IIR
adaptive filter. Convergence analysis of the RGS algorithm
and convergence time estimation for typical signal
processing situations are given in section 3.5.

For cases where information is available on the structure
of the optimal filter, a constrained, or a priori structured,
filter algorithm can be implemented. This concept is
discussed in section 3.6 and shows good results. Fig. 1.5-1
presents the error convergence of four filters: the LMS FIR
filter with 20 weights, an RGS IIR filter, an a priori
structure adaptive pole (ASPOL, section 3.6) IIR filter, and a
Feintuch algorithm IIR filter. The IIR filters have two
feedback and three feedforward weights. For this example it
is seen that:

1. The LMS algorithm converges fastest.

2. The RGS converges slower but reaches a lower
   steady-state error.

3. The ASPOL converges to the lowest steady-state error,
   faster than the RGS.

4. The Feintuch algorithm converges to the highest
   steady-state error.

These examples are typical.

Fig. 1.5-1
Error Convergence For Several Adaptive Filters
(plot title: Error Convergence For Several Algorithms;
horizontal axis: time k)

p is a parameter constant (pole magnitude)
R is the number of output samples used in estimating the
performance surface value for a fixed set of parameters
(random search interval).

II. ADAPTIVE FIR FILTERS USING SIMPLIFIED
GRADIENT ESTIMATIONS



2.1 TWO SIMPLIFIED GRADIENT ALGORITHMS 

Two simplified gradient algorithms are considered: 

(a) The Fixed Step Correction (FSC) adaptation scheme
is given by:

$$ a_i(k+1) = a_i(k) - \mu_a\, \mathrm{Sgn}\{\nabla(i,k)\} \tag{2.1-1} $$

This formulation is essentially binary and was motivated by
the general success of bang-bang type controllers. The
adaptation gain $\mu_a$ is the size of the fixed correction step.
We define the FSC gradient estimate as:

$$ \hat{\nabla}_{FSC}(i,k) = \mathrm{Sgn}\{\hat{\nabla}_{LMS}(i,k)\} = \mathrm{Sgn}\{\varepsilon_r(k)\}\, \mathrm{Sgn}\{x(k-i)\} \tag{2.1-2} $$

It should be noted that the sign of the gradient,
$\mathrm{Sgn}\{\nabla(i,k)\} = \mathrm{Sgn}\{x(k-i)\}\, \mathrm{Sgn}\{\varepsilon_r(k)\}$,
is identical for both error magnitude
and mean squared error estimates, that is

$$ \mathrm{Sgn}\left\{\frac{\partial |\varepsilon_r(k)|}{\partial a_i}\right\} = \mathrm{Sgn}\left\{\frac{\partial \varepsilon_r^2(k)}{\partial a_i}\right\} $$

so that (2.1-2) can be
derived from either error magnitude or mean squared error.

Large correction steps result in fast convergence to the steady
state near the optimal filter weights, but with a large
steady-state jitter around the optimal filter. These contradicting
effects call for engineering compromise in choosing the size
of the correction step $\mu_a$.

(b) The second approach is to use a variable size
step. A natural possibility is to consider

$$ \mu_a = \mu' \, |\varepsilon_r(k)| \tag{2.1-3} $$

The combination of (2.1-2), (2.1-1), and (2.1-3) gives:

$$ a_i(k+1) = a_i(k) - \mu' \, |\varepsilon_r(k)|\, \mathrm{Sgn}\{\varepsilon_r(k)\}\, \mathrm{Sgn}\{x(k-i)\} \tag{2.1-4} $$

We can use the regular adaptation gain symbol $\mu_a$ instead of
$\mu'$ and write

$$ a_i(k+1) = a_i(k) - \mu_a\, \varepsilon_r(k)\, \mathrm{Sgn}\{x(k-i)\} \tag{2.1-5} $$

(2.1-5) is the simplified LMS (SLMS) algorithm with the
gradient estimate given by:

$$ \hat{\nabla}_{SLMS}(i,k) = \varepsilon_r(k)\, \mathrm{Sgn}\{x(k-i)\} \tag{2.1-6} $$
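The three gradient estimates differ only in how much of the error and input magnitude they keep, which a side-by-side sketch makes visible. The code below is an illustration under stated assumptions, not the ALE experiment of Appendix A: the names `sgn`, `gradient_estimate` and `identify`, the mapping of zero to +1 in `sgn`, the uniform random input, the noise-free reference from a known two-weight system, and the gain values are all choices made for this example.

```python
import random


def sgn(v):
    # Assumption: zero is mapped to +1; the text defines Sgn only for
    # strictly positive and strictly negative arguments.
    return 1.0 if v >= 0 else -1.0


def gradient_estimate(kind, err, x_delayed):
    """Gradient estimate for weight i given e_r(k) and x(k-i)."""
    if kind == "LMS":
        return 2.0 * err * x_delayed          # (1.2-5)
    if kind == "FSC":
        return sgn(err) * sgn(x_delayed)      # (2.1-2)
    if kind == "SLMS":
        return err * sgn(x_delayed)           # (2.1-6)
    raise ValueError("unknown algorithm: " + kind)


def identify(kind, mu, target, n_steps=5000, seed=0):
    """Adapt an FIR filter towards a known (hypothetical) target system."""
    rng = random.Random(seed)
    n = len(target)
    a = [0.0] * n
    x_hist = [0.0] * n                        # x_hist[i] holds x(k-i)
    for _ in range(n_steps):
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]
        y = sum(a[i] * x_hist[i] for i in range(n))
        r = sum(target[i] * x_hist[i] for i in range(n))   # noise-free reference
        err = y - r
        a = [a[i] - mu * gradient_estimate(kind, err, x_hist[i])
             for i in range(n)]
    return a


target = [0.5, -0.3]
results = {
    "LMS": identify("LMS", 0.05, target),
    "FSC": identify("FSC", 0.002, target),
    "SLMS": identify("SLMS", 0.05, target),
}
```

The FSC gain is chosen much smaller than the other two because it is the step size itself. Even with a noise-free reference the FSC weights keep moving by one step per iteration, so they can only hover within a few steps of the target.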
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Typical operation of the LMS , FSC and SLMS algorithms are 
presented in Fig. 1.4-1. 



A useful graphic presentation of adaptive filter properties
is given by a plot of processing gain (PG) as a function
of convergence time (TC). The processing gain measures the
filter effectiveness and is defined as:

$$ PG = 10 \log \left[ R_{nn}(0) / J_{ss} \right] \tag{2.1-7} $$

where $R_{nn}(0)$ is the input noise power and $J_{ss}$, defined in
(1.2-9), is the output error power.

The convergence time, TC, is the time required to remove
90% of the initial excess MSE. The value of the performance
function at the time TC is:

$$ J(TC) = J_{ss} + 0.1 \left[ J(0) - J_{ss} \right] \tag{2.1-8} $$

This plot, named the Adaptive Filter Operating Characteristic
(AFOC), can be used for design when the number of filter
weights, $N_a$, is a parameter. It also provides a method of
comparison for different adaptation schemes. Curves for the
LMS, FSC, and SLMS algorithms are presented in Fig. 2.1-1A
for the ALE experiment of Appendix A.

Fig. 2.1-1. AFOC Comparison, ALE Experiments, Average of 100 Runs

We define the following complexity index (CF) for comparing
adaptation schemes:

$$ CF = \alpha N_{MUL} + N_{ADD} + \beta N_{CON} \tag{2.1-9} $$

where $N_{MUL}$, $N_{ADD}$ and $N_{CON}$ are the numbers of multiplication,
addition and control operations used in one iteration, and $\alpha$,
$\beta$ are weighting coefficients representing the cost of
each operation. A reasonable approximation which neglects
control operations is

$$ CF = \alpha N_{MUL} + N_{ADD} \tag{2.1-10} $$
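One point of an AFOC curve is a pair (TC, PG), and both coordinates can be read from a measured learning curve using (2.1-7) and (2.1-8). The sketch below is an illustration only. It assumes that the logarithm in (2.1-7) is base 10 (a gain in dB), that the learning curve is available as a list of $J(k)$ values starting at $k = 0$, and that $J_{ss}$ has been estimated separately. The function names are hypothetical.

```python
import math


def processing_gain_db(noise_power, j_ss):
    """PG = 10 log10[R_nn(0) / J_ss], assuming a base-10 logarithm in (2.1-7)."""
    return 10.0 * math.log10(noise_power / j_ss)


def convergence_time(learning_curve, j_ss):
    """First iteration k with J(k) <= J_ss + 0.1 [J(0) - J_ss], per (2.1-8).

    learning_curve : sequence of J(k) values, k = 0, 1, 2, ...
    Returns None when the threshold is never reached.
    """
    threshold = j_ss + 0.1 * (learning_curve[0] - j_ss)
    for k, j in enumerate(learning_curve):
        if j <= threshold:
            return k
    return None
```

Repeating this for several adaptation gains, with $N_a$ held fixed, traces one AFOC curve; changing $N_a$ gives a family of curves.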



Using the equations for the LMS, FSC and SLMS techniques we have
the following complexity indices as a function of the number
of filter weights:

$$ CF_{LMS} = (2N_a + 1)\alpha + 2N_a + 1 \tag{2.1-11} $$

$$ CF_{FSC} = N_a \alpha + 2N_a + 1 \tag{2.1-12} $$

$$ CF_{SLMS} = (N_a + 1)\alpha + 2N_a + 1 \tag{2.1-13} $$



As a reasonable numerical example, using $\alpha = 5$, we have
approximately equal complexity with $N_{LMS} = 6$, $N_{FSC} = 11$,
$N_{SLMS} =$ [illegible]. The comparison for this complexity is
presented in Fig. 2.1-1B and indicates that for a given
convergence time the simplified gradient methods provide
higher processing gain.
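The complexity indices (2.1-11) to (2.1-13) are simple enough to evaluate directly. The sketch below assumes the forms $CF_{LMS} = (2N_a+1)\alpha + 2N_a + 1$, $CF_{FSC} = N_a\alpha + 2N_a + 1$ and $CF_{SLMS} = (N_a+1)\alpha + 2N_a + 1$; the function names are hypothetical.

```python
def cf_lms(n_a, alpha):
    """Complexity index of the LMS algorithm, assumed form of (2.1-11)."""
    return (2 * n_a + 1) * alpha + 2 * n_a + 1


def cf_fsc(n_a, alpha):
    """Complexity index of the FSC algorithm, assumed form of (2.1-12)."""
    return n_a * alpha + 2 * n_a + 1


def cf_slms(n_a, alpha):
    """Complexity index of the SLMS algorithm, assumed form of (2.1-13)."""
    return (n_a + 1) * alpha + 2 * n_a + 1


if __name__ == "__main__":
    alpha = 5  # multiplication cost used in the numerical example of the text
    print("LMS,  N_a = 6 :", cf_lms(6, alpha))
    print("FSC,  N_a = 11:", cf_fsc(11, alpha))
    for n_a in (9, 10, 11):
        print("SLMS, N_a =", n_a, ":", cf_slms(n_a, alpha))
```

Under these assumed forms and $\alpha = 5$, the LMS filter with 6 weights and the FSC filter with 11 weights both have an index of 78, and the SLMS index is 76 for 10 weights and 83 for 11 weights.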
2.2 CONVERGENCE AND STABILITY



In this section we discuss conditions for the convergence
and the stability of the simplified gradient estimates. A
stable adaptive filter is one that converges to a near
optimal steady-state. We now define the convergence ratio,
$C_i(k)$:

$$ C_i(k) = \frac{a_i(k+1) - a_i^*}{a_i(k) - a_i^*} \tag{2.2-1} $$

where $a_i^*$ is the optimal value for the weight $a_i$.

Following Widrow et al. [1, 3, 4] we define the weight
noise, $V_i(k)$, as:

$$ V_i(k) = a_i(k) - a_i^* \tag{2.2-2} $$

Combining (2.2-1) and (2.2-2) gives:

$$ C_i(k) = \frac{V_i(k+1)}{V_i(k)} \tag{2.2-3} $$

From (2.2-2) and (1.2-6) we get:

$$ V_i(k+1) = V_i(k) - \mu_a \hat{\nabla}_{a_i}(k) \tag{2.2-4} $$

Combining (2.2-3) and (2.2-4) we get:

$$ C_i(k) = 1 - \mu_a \frac{\hat{\nabla}_{a_i}(k)}{V_i(k)} \tag{2.2-5} $$
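The equivalence of the two expressions for the convergence ratio, (2.2-1) and (2.2-5), can be checked numerically for a single weight. The sketch below is illustrative only; it assumes a generic update $a_i(k+1) = a_i(k) - \mu_a \hat{\nabla}_{a_i}(k)$, and the function names and numerical values are hypothetical.

```python
def convergence_ratio(a_k, a_next, a_opt):
    """C_i(k) = (a_i(k+1) - a_i*) / (a_i(k) - a_i*), as in (2.2-1)."""
    return (a_next - a_opt) / (a_k - a_opt)


def convergence_ratio_from_gradient(a_k, a_opt, mu, grad):
    """C_i(k) = 1 - mu * grad / V_i(k), with V_i(k) = a_i(k) - a_i*, as in (2.2-5)."""
    v = a_k - a_opt
    return 1.0 - mu * grad / v


if __name__ == "__main__":
    a_k, a_opt, mu, grad = 0.30, 0.50, 0.05, -1.2   # arbitrary test values
    a_next = a_k - mu * grad                        # assumed generic weight update
    print(convergence_ratio(a_k, a_next, a_opt))
    print(convergence_ratio_from_gradient(a_k, a_opt, mu, grad))
```

A ratio between 0 and 1 means the weight moved towards its optimal value without overshoot; a magnitude above 1 means the step increased the weight noise.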



The steady state average convergence ratio is defined as:

$$ C_i = E\{C_i(k)\} = 1 - \mu_a E\left\{ \frac{\hat{\nabla}_{a_i}(k)}{V_i(k)} \right\} \tag{2.2-6} $$

where k is large enough for operation of the filter to be in
steady-state. From this point we proceed with the specific
case of the SLMS, with its gradient estimate given by (2.1-6).

The error as a function of the weight noise is given from (2.2-2)
and (1.2-4) as:

$$ \varepsilon_r(k) = \varepsilon_r^*(k) + \sum_{j=0}^{N_a-1} V_j(k) \, x(k-j) \tag{2.2-7} $$

where $\varepsilon_r^*(k)$ is the optimal error at time k and is given by:

$$ \varepsilon_r^*(k) = \sum_{i=0}^{N_a-1} a_i^* \, x(k-i) - r(k) \tag{2.2-8} $$



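Inserting (2.2-7) into (2.1-6) results in the equation:

$$ \hat{\nabla}_{SLMS}(i,k) = \varepsilon_r^*(k) \, \mathrm{Sgn}\{x(k-i)\} + \sum_{j=0}^{N_a-1} V_j(k) \, x(k-j) \, \mathrm{Sgn}\{x(k-i)\} \tag{2.2-9} $$

Inserting (2.2-9) into (2.2-6) we get:

$$ C_i = 1 - \mu_a \left\{ E\left\{ \frac{\varepsilon_r^*(k) \, \mathrm{Sgn}\{x(k-i)\}}{V_i(k)} \right\} + \sum_{j=0}^{N_a-1} E\left\{ \frac{V_j(k)}{V_i(k)} \, x(k-j) \, \mathrm{Sgn}\{x(k-i)\} \right\} \right\} \tag{2.2-10} $$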



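$\varepsilon_r^*(k)$ is independent of $x(k-i)$ and of $V_i(k)$ so that:

$$ E\left\{ \frac{\varepsilon_r^*(k) \, \mathrm{Sgn}\{x(k-i)\}}{V_i(k)} \right\} = E\{\varepsilon_r^*(k)\} \, E\left\{ \frac{\mathrm{Sgn}\{x(k-i)\}}{V_i(k)} \right\} = 0 \tag{2.2-11} $$

because $E[\varepsilon_r^*(k)] = 0$.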

To continue with the simplification of (2.2-10) we make the
following assumptions:

(a) $V_i(k)$ and $x(k-j)$ are uncorrelated

(b) $E\{V_j(k)/V_i(k)\} = 1$                                  (2.2-12)

Assumption (a) is similar to the uncorrelated input assumption
used by Widrow in [1] and seems to be justified by his
results. Assumption (b) is made for mathematical convenience
and can be justified by the dependence of the weight noises
on the common error terms and the uniform statistics of the
input signal over the filter memory.

Using (2.2-11) and (2.2-12) in (2.2-10) we get:

$$ C_i = 1 - \mu_a \sum_{j=0}^{N_a-1} E\{ x(k-j) \, \mathrm{Sgn}[x(k-i)] \} \tag{2.2-13} $$



Since $x(k-j) \, \mathrm{Sgn}\{x(k-i)\} \le |x(k-j)|$ we can write:

$$ C_i \ge 1 - \mu_a \sum_{j=0}^{N_a-1} E|x(k-j)| \tag{2.2-14} $$

For stationary input signals $E|x(k-j)| = E|x(k)|$ for all
values of j and we get:

$$ C_i \ge 1 - \mu_a N_a E|x(k)| \tag{2.2-15} $$



For stable operation, as well as for convergence to the
optimal weight values, we require

$$ |C_i| < 1 \tag{2.2-16} $$

Manipulating (2.2-15) and (2.2-16) to obtain the stability
condition for $\mu_a$ yields for the SLMS algorithm:

$$ 0 < \mu_{SLMS} < \frac{2}{N_a E|x(k)|} \tag{2.2-17} $$
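The step from (2.2-15) and (2.2-16) to (2.2-17) can be spelled out as follows. This is a sketch of the intermediate algebra, assuming that the bounding quantity $1 - \mu_a N_a E|x(k)|$ of (2.2-15) is itself required to lie strictly between $-1$ and $1$:

```latex
-1 < 1 - \mu_a N_a E|x(k)| < 1
\;\Longleftrightarrow\;
0 < \mu_a N_a E|x(k)| < 2
\;\Longleftrightarrow\;
0 < \mu_a < \frac{2}{N_a \, E|x(k)|}
```

The right-hand inequality of the first line gives the lower limit $\mu_a > 0$, and the left-hand inequality gives the upper limit on the adaptation gain.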



To express (2.2-17) as a function of the input power, $R_{xx}(0)$,
we can define the input signal form factor, $F_x$, as:

$$ F_x = \frac{E|x(k)|}{\sqrt{E\{x^2(k)\}}} \tag{2.2-18} $$

Now inserting (2.2-18) in (2.2-17) results in:

$$ 0 < \mu_{SLMS} < \frac{2}{N_a F_x \sqrt{R_{xx}(0)}} \tag{2.2-19} $$
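The form factor (2.2-18) and the resulting SLMS gain limit (2.2-19) can be evaluated from a block of input samples. The sketch below is illustrative only: it assumes sample averages as estimates of the expectations, and the function names are hypothetical. The two closed-form constants are standard results for a zero-mean Gaussian signal and for a sinusoid averaged over whole periods; they are not taken from the text.

```python
import math


def form_factor(samples):
    """Sample estimate of F_x = E|x| / sqrt(E{x^2}), as in (2.2-18)."""
    n = len(samples)
    mean_abs = sum(abs(s) for s in samples) / n
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return mean_abs / rms


def slms_gain_bound(n_a, f_x, input_power):
    """Upper limit 2 / (N_a F_x sqrt(R_xx(0))) on the SLMS gain, as in (2.2-19)."""
    return 2.0 / (n_a * f_x * math.sqrt(input_power))


# Closed-form form factors for two common inputs (standard results).
F_GAUSSIAN = math.sqrt(2.0 / math.pi)          # zero-mean Gaussian, about 0.798
F_SINUSOID = 2.0 * math.sqrt(2.0) / math.pi    # sinusoid, about 0.900
```

Because $F_x \le 1$ for any signal, the limit in (2.2-19) is never smaller than $2 / (N_a \sqrt{R_{xx}(0)})$.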



For the LMS algorithm we can use (2.2-6) and the LMS gradient
estimate. Following the above derivation and using the
assumptions of (2.2-12) we get

$$ 0 < \mu_{LMS} < \frac{1}{N_a R_{xx}(0)} \tag{2.2-20} $$

(2.2-20) is equivalent to equation (32) in [4], which was
derived in a different manner but with similar assumptions.

To derive the stability region of the FSC algorithm we
use (2.2-17) and the relationship between the FSC and the
SLMS algorithms. We define an equivalent adaptation
gain, $\mu_{eq}$, by the formula

$$ \mu_{FSC} = \mu_{eq} E|\varepsilon_r(k)| \tag{2.2-21} $$

It is interesting to note that we are now using the derivation
process of Section 2.1 for the SLMS algorithm in a
reverse direction. The case of greatest interest is that of
a low signal to noise ratio. For this case we use the
following approximations:

$$ \varepsilon_r(k) = y(k) - s(k+1) - n(k+1) \approx -n(k+1) \approx -r(k) $$

and

$$ E|\varepsilon_r(k)| = E|r(k)| = E|x(k)| \tag{2.2-22} $$

Inserting $\mu_{eq}$ from (2.2-21) into the SLMS relation, given by
(2.2-17), with the use of (2.2-22) results in the following:

$$ 0 < \mu_{FSC} < 2/N_a \tag{2.2-23} $$
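The substitution leading to (2.2-23) can be written out in one line. The sketch below assumes that the SLMS limit (2.2-17) is applied to the equivalent gain $\mu_{eq}$ and that the low signal to noise approximation (2.2-22) holds:

```latex
\mu_{FSC} = \mu_{eq} \, E|\varepsilon_r(k)|
          < \frac{2}{N_a \, E|x(k)|} \, E|\varepsilon_r(k)|
          \approx \frac{2}{N_a \, E|x(k)|} \, E|x(k)|
          = \frac{2}{N_a}
```

The input level cancels, so under these assumptions the FSC limit depends only on the number of weights.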

The foregoing relationships, (2.2-17), (2.2-20) and (2.2-23),
are based upon the average behavior of the algorithms. In
practice, to avoid numerical overflow, we must use adaptation
gain values much smaller than the upper limits indicated in the
above relations. An additional consideration that also results
in a smaller adaptation gain is the misadjustment. For all
the algorithms, the use of the upper bound value for the
adaptation gain results in a misadjustment of the order of the
optimal filter gain (PF), which means that in practice we are
restricted to much lower values of the adaptation gain. The
results of this section are included in Table 2.6-1.

2.3 CONVERGENCE TIME ESTIMATION

In order to estimate the convergence time of an adaptive
filter one may visualize the process as changing the weights
with some average step, $\Delta$, taken in most of the iterations
towards the optimal value of the weight. Assuming an initial
value of zero for all the weights, the longest convergence
time will be associated with the weight having the largest
absolute value, $a_{max}$. From the above it is reasonable to assume
the following relationship:

$$ TC = \alpha_1 N_a^{\alpha_2} \frac{a_{max}}{\Delta} \tag{2.3-1} $$



where $N_a$ is the number of filter weights, and $\alpha_1$, $\alpha_2$ are
unknown coefficients. $a_{max}/\Delta$ is the exact number of steps
needed for convergence if the correction is always in the
right direction. In practice the gradient estimation causes
errors in the direction, and the number of iterations required
to converge to the optimal value of the weights is
modified by a factor that depends in some non-linear way on
the number of weights $N_a$. This modification is represented
in (2.3-1) by the factor $N_a^{\alpha_2}$. Also, $\alpha_1$ depends on the
exact definition of TC (i.e. 10% or $e^{-1}$ of the initial error
squared). Filter operation involves a linear combination of
input values. Since the reference amplitude is independent
of $N_a$, when we combine more input samples the relative weight
associated with each sample should be smaller; mathematically,

$$ a_{max} = \frac{\alpha_3}{N_a} \tag{2.3-2} $$

In general, $\alpha_3$ depends upon the input signal to noise ratio,
as discussed in the literature [3, 33]. This dependency is
not taken into account in the derivations which follow in
order to simplify comparison of the new algorithms with
existing algorithms. The results of reference [37] can be
used to modify the results presented here to include the
dependence upon input signal to noise ratio.

When looking at specific applications, such as Adaptive
Line Enhancement (ALE), one can determine the value of $\alpha_3$ in
(2.3-2) exactly. Inserting (2.3-2) into (2.3-1) and absorbing
$\alpha_3$ into $\alpha_1$, we write:

$$ TC = \frac{\alpha_1}{\Delta \, N_a^{1-\alpha_2}} \tag{2.3-3} $$

$\Delta$ in (2.3-3) depends on the adaptive scheme. It is the
fixed step size in the FSC algorithm and an average step
size for the LMS and the SLMS algorithms. Thus for these
three cases we define:

$$ \Delta_{FSC} = \mu_{FSC} \tag{2.3-4} $$

$$ \Delta_{LMS} = E|\mu_{LMS} \hat{\nabla}_{LMS}| = 2\mu_{LMS} E|\varepsilon(k) \, x(k-i)| \tag{2.3-5} $$

$$ \Delta_{SLMS} = E|\mu_{SLMS} \hat{\nabla}_{SLMS}| = \mu_{SLMS} E|\varepsilon(k) \, \mathrm{Sgn}\{x(k-i)\}| \tag{2.3-6} $$
Using (2.3-3) and (2.3-4) and the empirical coefficients
$\alpha_1 = 1.65$, $\alpha_2 = 1/2$, as evaluated using the simulations described in
Appendix A, we get the following FSC convergence time to 10%
of the initial squared error:

$$ TC = \frac{1.65}{\mu_{FSC} \sqrt{N_a}} \tag{2.3-7} $$

where TC is the time required to reduce the error to 10% as
defined by (2.1-8). Fig. 2.3-1 presents a verification of
(2.3-7) using simulation results with several values of $\mu_{FSC}$,
$N_a$, and the input power $R_{xx}(0)$.

The significance of these results is that they confirm
that the convergence time is inversely proportional to the
adaptation gain and the square root of the number of weights.
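The FSC convergence time estimate (2.3-7) is easy to tabulate for design purposes. The sketch below is illustrative only; it uses the empirical coefficient 1.65 quoted in the text for the 10% criterion, and the function name and example values are hypothetical.

```python
import math

ALPHA_1_FSC = 1.65  # empirical coefficient quoted in the text (10% criterion)


def tc_fsc(mu_fsc, n_a):
    """Estimated FSC convergence time in iterations: 1.65 / (mu_FSC sqrt(N_a))."""
    return ALPHA_1_FSC / (mu_fsc * math.sqrt(n_a))


if __name__ == "__main__":
    # Hypothetical design values, chosen only to show the scaling.
    for n_a in (4, 16, 64):
        for mu in (0.001, 0.002):
            print(f"N_a = {n_a:3d}, mu = {mu:.3f}: TC = {tc_fsc(mu, n_a):.1f}")
```

Doubling the adaptation gain halves the estimate, and a fourfold increase in the number of weights also halves it, which is the square-root dependence stated above.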
Assuming in (2.3-5) that

$$ E\{ |\varepsilon(k)| \cdot |x(k-i)| \} = E|\varepsilon(k)| \cdot E|x(k-i)| $$

we get:

$$ \Delta_{LMS} = 2\mu_{LMS} \, E|\varepsilon(k)| \cdot E|x(k-i)| \tag{2.3-8} $$



At the start of the adaptation process the initial weights
have a value of zero, so that $y(0) = 0$ and $|\varepsilon(0)| = |r(0)|$. For
the correlated reference case the reference power is essentially
the same as the input power and we have:

$$ \Delta_{LMS} = 2\mu_{LMS} \, E|x(k)| \cdot E|x(k)| \tag{2.3-10} $$

Fig. 2.3-1. Fixed Size Correction Convergence Time, Theory and Simulation Results



Using expression (2.3-10) for the LMS average step size in
(2.3-3) with $\alpha_2 = 1/2$ we have

$$ TC_{LMS} = \frac{\alpha_{LMS}}{\mu_{LMS} \sqrt{N_a} \, (E|x(k)|)^2} \tag{2.3-11} $$



In [4] the classical LMS convergence time estimate is given
by:

$$ TC_{LMS} = \frac{\ln 10}{4 \mu_{LMS} R_{xx}(0)} \tag{2.3-12} $$



Based on the simulation described in Appendix A we select
$\alpha_{LMS} = 0.555$.

Fig. 2.3-2 presents a comparison of simulation results
with the classical convergence time formula, (2.3-12), and
the new convergence time formula, (2.3-11). This figure
indicates clearly that the convergence time depends upon the
number of weights, $N_a$, as developed in (2.3-11), and that
this formulation is more accurate than that of (2.3-12), which
was developed in reference [4]. In a similar way (2.3-6) and
(2.3-3) give

$$ TC_{SLMS} = \frac{\alpha_{SLMS}}{\mu_{SLMS} \sqrt{N_a} \, E|x(k)|} \tag{2.3-13} $$

Fig. 2.3-2. LMS Convergence Time, Simulation Results and Theory

Fig. 2.3-3 presents a comparison of (2.3-13) with simulation
results, with $\alpha_{SLMS} = 1.4$, based upon the results of
simulations described in Appendix A. The comparison confirms
(2.3-13). The key formulas of this section are included in
Table 2.6-1.

Fig. 2.3-3. SLMS Convergence Time, Simulation Results and Theory
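The LMS and SLMS convergence time estimates of this section can be compared side by side. The sketch below is illustrative only and rests on assumed forms: $TC_{LMS} = \alpha_{LMS} / (\mu_{LMS}\sqrt{N_a}\,(E|x(k)|)^2)$ for (2.3-11), $TC_{SLMS} = \alpha_{SLMS} / (\mu_{SLMS}\sqrt{N_a}\,E|x(k)|)$ for (2.3-13), and a classical estimate $\ln 10 / (4\mu_{LMS}R_{xx}(0))$ for (2.3-12). The coefficients 0.555 and 1.4 are the empirical values quoted in the text; the function names are hypothetical.

```python
import math

ALPHA_LMS = 0.555   # empirical coefficient quoted in the text
ALPHA_SLMS = 1.4    # empirical coefficient quoted in the text


def tc_lms(mu, n_a, mean_abs_x):
    """Assumed form of (2.3-11): alpha_LMS / (mu sqrt(N_a) (E|x|)^2)."""
    return ALPHA_LMS / (mu * math.sqrt(n_a) * mean_abs_x ** 2)


def tc_slms(mu, n_a, mean_abs_x):
    """Assumed form of (2.3-13): alpha_SLMS / (mu sqrt(N_a) E|x|)."""
    return ALPHA_SLMS / (mu * math.sqrt(n_a) * mean_abs_x)


def tc_lms_classical(mu, input_power):
    """Assumed form of the classical estimate (2.3-12): ln 10 / (4 mu R_xx(0))."""
    return math.log(10.0) / (4.0 * mu * input_power)
```

Under these assumed forms the LMS estimate scales with the inverse square of the input level and the SLMS estimate with the inverse of the input level, while the classical estimate has no dependence on $N_a$.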
2.4 STEADY STATE ERROR AND MISADJUSTMENT



In order to evaluate the steady-state error we start with
general relationships. First, following [1, 3, 4], we define the
weight noise $v_i(k)$ as:

$$ v_i(k) = a_i(k) - a_i^* \tag{2.4-1} $$

where $a_i^*$ is the optimal ith weight.



The filter output can then be written as:

$$ y(k) = \sum_{i=0}^{N_a-1} a_i(k) \, x(k-i) = \sum_{i=0}^{N_a-1} a_i^* \, x(k-i) + \sum_{i=0}^{N_a-1} v_i(k) \, x(k-i) \tag{2.4-2} $$



Define

$$ \varepsilon_s(k) = y(k) - s(k) = \sum_{i=0}^{N_a-1} a_i^* \, x(k-i) - s(k) + \sum_{i=0}^{N_a-1} v_i(k) \, x(k-i) \tag{2.4-3} $$

We can now define the optimal instantaneous error:

$$ \varepsilon_s^*(k) = \sum_{i=0}^{N_a-1} a_i^* \, x(k-i) - s(k) \tag{2.4-4} $$
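The decomposition of the instantaneous error into the optimal error and a weight noise term, implied by (2.4-1) to (2.4-4), can be checked numerically. The sketch below is illustrative only; it assumes the window ordering $[x(k), x(k-1), \ldots]$, and the function names and numerical values are hypothetical.

```python
def fir_output(weights, x_window):
    """Sum over i of weights[i] * x(k-i), with x_window = [x(k), x(k-1), ...]."""
    return sum(w * x for w, x in zip(weights, x_window))


def error_decomposition(a, a_opt, x_window, s):
    """Return (eps_s, eps_s_opt, noise_term) for one time instant.

    eps_s      : y(k) - s(k), the instantaneous error of (2.4-3)
    eps_s_opt  : optimal instantaneous error of (2.4-4)
    noise_term : sum over i of v_i(k) x(k-i), with v_i(k) from (2.4-1)
    """
    v = [ai - ao for ai, ao in zip(a, a_opt)]    # weight noise v_i(k)
    eps_s = fir_output(a, x_window) - s
    eps_s_opt = fir_output(a_opt, x_window) - s
    noise_term = fir_output(v, x_window)
    return eps_s, eps_s_opt, noise_term


if __name__ == "__main__":
    # Arbitrary values used only to exercise the identity.
    eps_s, eps_opt, noise = error_decomposition(
        [0.5, -0.2, 0.1], [0.4, -0.1, 0.0], [1.0, 2.0, -1.0], 0.3)
    print(eps_s, eps_opt + noise)
```

The two printed values agree up to rounding, since the error is linear in the weights.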



Using this value the minimum mean squared error is: